
    Investigating Combining Quantitative And Textual Causal Knowledge In Learning Causal Structure

    The study of causes and effects in large systems, in fields such as meteorology, biochemistry, finance, and sociology, plays a critical role in predicting future developments and possible interventions. In recent decades, several new techniques and algorithms have been developed to discover causal structures in multivariate quantitative datasets. Yet determining causal structure from observations alone is challenging and often yields ambiguous results. Additional knowledge from other sources is likely to be beneficial. Recently emerging large-scale language models are showing impressive results in the field of natural language processing (NLP). One task in NLP is to extract causal relations from text. Combining these relations with causal discovery algorithms could be advantageous. This bachelor thesis investigates the combination of causal structures from quantitative and qualitative sources. A feasibility study was conducted on two datasets: (1) a biochemistry flow cytometry dataset and (2) a self-collected financial dataset. During this process, a common framework was developed that enables the combination of both sources. Considerations and problems were recorded and improvements suggested. A focus was placed on visualizing the evidence with different Python and R libraries. In principle, it is possible to combine both domains. However, training data for causal relation extraction was found to be lacking. Knowledge graphs with an underlying ontology need to be leveraged to account for lexically different terms that refer to the same entity. To improve the results from the qualitative data, it would be advantageous to extract events rather than causal relations. This thesis contributes to the study of integrating quantitative and qualitative causal knowledge by applying various methods to two distinct datasets from different domains. Furthermore, it addresses a research gap, as there is limited existing literature in this specific area to the best of my knowledge.